AITopics | Babil Governorate

Collaborating Authors

Babil Governorate

Text Generation Models for Luxembourgish with Limited Data: A Balanced Multilingual Strategy

Plum, Alistair, Ranasinghe, Tharindu, Purschke, Christoph

arXiv.org Artificial IntelligenceDec-20-2024

This paper addresses the challenges in developing language models for less-represented languages, with a focus on Luxembourgish. Despite its active development, Luxembourgish faces a digital data scarcity, exacerbated by Luxembourg's multilingual context. We propose a novel text generation model based on the T5 architecture, combining limited Luxembourgish data with equal amounts, in terms of size and type, of German and French data. We hypothesise that a model trained on Luxembourgish, German, and French will improve the model's cross-lingual transfer learning capabilities and outperform monolingual and large multilingual models. To verify this, the study at hand explores whether multilingual or monolingual training is more beneficial for Luxembourgish language generation. For the evaluation, we introduce LuxGen, a text generation benchmark that is the first of its kind for Luxembourgish.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.09415

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Germany > Saxony > Leipzig (0.05)
(16 more...)

Genre: Research Report (1.00)

Industry: Media > News (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Neural Text Normalization for Luxembourgish using Real-Life Variation Data

Lutgen, Anne-Marie, Plum, Alistair, Purschke, Christoph, Plank, Barbara

arXiv.org Artificial IntelligenceDec-13-2024

Orthographic variation is very common in Luxembourgish texts due to the absence of a fully-fledged standard variety. Additionally, developing NLP tools for Luxembourgish is a difficult task given the lack of annotated and parallel data, which is exacerbated by ongoing standardization. In this paper, we propose the first sequence-to-sequence normalization models using the ByT5 and mT5 architectures with training data obtained from word-level real-life variation data. We perform a fine-grained, linguistically-motivated evaluation to test byte-based, word-based and pipeline-based models for their strengths and weaknesses in text normalization. We show that our sequence model using real-life variation data is an effective approach for tailor-made normalization in Luxembourgish.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.09383

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > Middle East > Iraq > Babil Governorate > Hillah (0.04)
(12 more...)

Genre: Research Report (0.40)

Industry: Education (0.68)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Add feedback

Mixture-of-PageRanks: Replacing Long-Context with Real-Time, Sparse GraphRAG

Alonso, Nicholas, Millidge, Beren

arXiv.org Artificial IntelligenceDec-8-2024

Recent advances have extended the context window of frontier LLMs dramatically, from a few thousand tokens up to millions, enabling entire books and codebases to fit into context. However, the compute costs of inferencing long-context LLMs are massive and often prohibitive in practice. RAG offers an efficient and effective alternative: retrieve and process only the subset of the context most important for the current task. Although promising, recent work applying RAG to long-context tasks has two core limitations: 1) there has been little focus on making the RAG pipeline compute efficient, and 2) such works only test on simple QA tasks, and their performance on more challenging tasks is unclear. To address this, we develop an algorithm based on PageRank, a graph-based retrieval algorithm, which we call mixture-of-PageRanks (MixPR). MixPR uses a mixture of PageRank-based graph-retrieval algorithms implemented using sparse matrices for efficent, cheap retrieval that can deal with a variety of complex tasks. Our MixPR retriever achieves state-of-the-art results across a wide range of long-context benchmark tasks, outperforming both existing RAG methods, specialized retrieval architectures, and long-context LLMs despite being far more compute efficient. Due to using sparse embeddings, our retriever is extremely compute efficient, capable of embedding and retrieving millions of tokens within a few seconds and runs entirely on CPU.

arxiv preprint arxiv, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2412.06078

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > Middle East > Iraq > Babil Governorate (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

LuxBank: The First Universal Dependency Treebank for Luxembourgish

Plum, Alistair, Döhmer, Caroline, Milano, Emilia, Lutgen, Anne-Marie, Purschke, Christoph

arXiv.org Artificial IntelligenceNov-7-2024

The Universal Dependencies (UD) project has significantly expanded linguistic coverage across 161 languages, yet Luxembourgish, a West Germanic language spoken by approximately 400,000 people, has remained absent until now. In this paper, we introduce LuxBank, the first UD Treebank for Luxembourgish, addressing the gap in syntactic annotation and analysis for this `low-research' language. We establish formal guidelines for Luxembourgish language annotation, providing the foundation for the first large-scale quantitative analysis of its syntax. LuxBank serves not only as a resource for linguists and language learners but also as a tool for developing spell checkers and grammar checkers, organising existing text archives and even training large language models. By incorporating Luxembourgish into the UD framework, we aim to enhance the understanding of syntactic variation within West Germanic languages and offer a model for documenting smaller, semi-standardised languages. This work positions Luxembourgish as a valuable resource in the broader linguistic and NLP communities, contributing to the study of languages with limited research and resources.

guideline, luxbank, luxembourgish, (15 more...)

arXiv.org Artificial Intelligence

2411.04813

Country:

Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.05)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > Middle East > Iraq > Babil Governorate > Hillah (0.04)
(8 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Add feedback

Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data

Maekawa, Seiji, Iso, Hayate, Bhutani, Nikita

arXiv.org Artificial IntelligenceOct-15-2024

The rapid increase in textual information means we need more efficient methods to sift through, organize, and understand it all. While retrieval-augmented generation (RAG) models excel in accessing information from large document collections, they struggle with complex tasks that require aggregation and reasoning over information spanning across multiple documents--what we call holistic reasoning. Long-context language models (LCLMs) have great potential for managing large-scale documents, but their holistic reasoning capabilities remain unclear. In this work, we introduce HoloBench, a novel framework that brings database reasoning operations into text-based contexts, making it easier to systematically evaluate how LCLMs handle holistic reasoning across large documents. Our approach adjusts key factors such as context length, information density, distribution of information, and query complexity to evaluate LCLMs comprehensively. Our experiments show that the amount of information in the context has a bigger influence on LCLM performance than the actual context length. Furthermore, the complexity of queries affects performance more than the amount of information, particularly for different types of queries. Interestingly, queries that involve finding maximum or minimum values are easier for LCLMs and are less affected by context length, even though they pose challenges for RAG systems. However, tasks requiring the aggregation of multiple pieces of information show a noticeable drop in accuracy as context length increases. Additionally, we find that while grouping relevant information generally improves performance, the optimal positioning varies across models. Our findings surface both the advancements and the ongoing challenges in achieving a holistic understanding of long contexts.

information, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.11996

Country:

North America > United States > California > Sonoma County (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(11 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Infrastructure & Services > Airport (1.00)
Transportation > Air (1.00)
Consumer Products & Services (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Comparison of Epilepsy Induced by Ischemic Hypoxic Brain Injury and Hypoglycemic Brain Injury using Multilevel Fusion of Data Features

Kadem, Sameer, Sami, Noor, Elaraby, Ahmed, Alyousif, Shahad, Jalil, Mohammed, Altaee, M., Almusawi, Muntather, Ismaeel, A. Ghany, Kareem, Ali Kamil, Kamalrudin, Massila, ftaiet, Adnan Allwi

arXiv.org Artificial IntelligenceSep-3-2024

The study aims to investigate the similarities and differences in the brain damage caused by Hypoxia-Ischemia (HI), Hypoglycemia, and Epilepsy. Hypoglycemia poses a significant challenge in improving glycemic regulation for insulin-treated patients, while HI brain disease in neonates is associated with low oxygen levels. The study examines the possibility of using a combination of medical data and Electroencephalography (EEG) measurements to predict outcomes over a two-year period. The study employs a multilevel fusion of data features to enhance the accuracy of the predictions. Therefore this paper suggests a hybridized classification model for Hypoxia-Ischemia and Hypoglycemia, Epilepsy brain injury (HCM-BI). A Support Vector Machine is applied with clinical details to define the Hypoxia-Ischemia outcomes of each infant. The newborn babies are assessed every two years again to know the neural development results. A selection of four attributes is derived from the Electroencephalography records, and SVM does not get conclusions regarding the classification of diseases. The final feature extraction of the EEG signal is optimized by the Bayesian Neural Network (BNN) to get the clear health condition of Hypoglycemia and Epilepsy patients. Through monitoring and assessing physical effects resulting from Electroencephalography, The Bayesian Neural Network (BNN) is used to extract the test samples with the most log data and to report hypoglycemia and epilepsy Keywords- Hypoxia-Ischemia , Hypoglycemia , Epilepsy , Multilevel Fusion of Data Features , Bayesian Neural Network (BNN) , Support Vector Machine (SVM)

brain injury, hypoglycemia, practice and application, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.54216/FPA.100106

2409.02957

Country:

Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
Asia > Middle East > Saudi Arabia > Al-Qassim Province > Buraydah (0.04)
Europe > United Kingdom (0.04)
(6 more...)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.87)

Industry: Health & Medicine > Therapeutic Area > Neurology > Epilepsy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.98)

Add feedback

The $\mu\mathcal{G}$ Language for Programming Graph Neural Networks

Belenchia, Matteo, Corradini, Flavio, Quadrini, Michela, Loreti, Michele

arXiv.org Artificial IntelligenceJul-12-2024

Graph neural networks form a class of deep learning architectures specifically designed to work with graph-structured data. As such, they share the inherent limitations and problems of deep learning, especially regarding the issues of explainability and trustworthiness. We propose $\mu\mathcal{G}$, an original domain-specific language for the specification of graph neural networks that aims to overcome these issues. The language's syntax is introduced, and its meaning is rigorously defined by a denotational semantics. An equivalent characterization in the form of an operational semantics is also provided and, together with a type system, is used to prove the type soundness of $\mu\mathcal{G}$. We show how $\mu\mathcal{G}$ programs can be represented in a more user-friendly graphical visualization, and provide examples of its generality by showing how it can be used to define some of the most popular graph neural network models, or to develop any custom graph processing application.

expression, graph neural network, neural network, (12 more...)

arXiv.org Artificial Intelligence

2407.09441

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > Italy (0.04)
(9 more...)

Genre:

Research Report (0.64)
Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ProMoAI: Process Modeling with Generative AI

Kourani, Humam, Berti, Alessandro, Schuster, Daniel, van der Aalst, Wil M. P.

arXiv.org Artificial IntelligenceApr-29-2024

ProMoAI is a novel tool that leverages Large Language Models (LLMs) to automatically generate process models from textual descriptions, incorporating advanced prompt engineering, error handling, and code generation techniques. Beyond automating the generation of complex process models, ProMoAI also supports process model optimization. Users can interact with the tool by providing feedback on the generated model, which is then used for refining the process model. ProMoAI utilizes the capabilities LLMs to offer a novel, AI-driven approach to process modeling, significantly reducing the barrier to entry for users without deep technical knowledge in process modeling.

llm, process model, springer, (14 more...)

arXiv.org Artificial Intelligence

2403.04327

Country:

Europe > Austria > Vienna (0.14)
Asia > Middle East > Iraq > Babil Governorate > Hillah (0.05)
Africa > Rwanda > Kigali > Kigali (0.05)
(9 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.52)

Add feedback

LTL under reductions with weaker conditions than stutter-invariance

Paviot-Adet, Emmanuel, Poitrenaud, Denis, Renault, Etienne, Thierry-Mieg, Yann

arXiv.org Artificial IntelligenceJan-6-2023

Verification of properties expressed as-regular languages such as LTL can benefit hugely from stutter-insensitivity, using a diverse set of reduction strategies. However properties that are not stutter-insensitive, for instance due to the use of the neXt operator of LTL or to some form of counting in the logic, are not covered by these techniques in general. We propose in this paper to study a weaker property than stutter-insensitivity. In a stutter insensitive language both adding and removing stutter to a word does not change its acceptance, any stuttering can be abstracted away; by decomposing this equivalence relation into two implications we obtain weaker conditions. We define a shortening insensitive language where any word that stutters less than a word in the language must also belong to the language. A lengthening insensitive language has the dual property. A semi-decision procedure is then introduced to reliably prove shortening insensitive properties or deny lengthening insensitive properties while working with a reduction of a system. A reduction has the property that it can only shorten runs. Lipton's transaction reductions or Petri net agglomerations are examples of eligible structural reduction strategies. An implementation and experimental evidence is provided showing most nonrandom properties sensitive to stutter are actually shortening or lengthening insensitive. Performance of experiments on a large (random) benchmark from the model-checking competition indicate that despite being a semi-decision procedure, the approach can still improve state of the art verification tools.

artificial intelligence, logic & formal reasoning, reduction, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-08679-3_11

2111.03342

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.47)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.91)

Add feedback

Modeling Effect of Lockdowns and Other Effects on India Covid-19 Infections Using SEIR Model and Machine Learning

Sampath, Sathiyanarayanan, Bose, Joy

arXiv.org Artificial IntelligenceOct-4-2021

The SEIR model is a widely used epidemiological model used to predict the rise in infections. This model has been widely used in different countries to predict the number of Covid-19 cases. But the original SEIR model does not take into account the effect of factors such as lockdowns, vaccines, and re-infections. In India the first wave of Covid started in March 2020 and the second wave in April 2021. In this paper, we modify the SEIR model equations to model the effect of lockdowns and other influencers, and fit the model on data of the daily Covid-19 infections in India using lmfit, a python library for least squares minimization for curve fitting. We modify R0 parameter in the standard SEIR model as a rectangle in order to account for the effect of lockdowns. Our modified SEIR model accurately fits the available data of infections.

india, infection, seir model, (13 more...)

arXiv.org Artificial Intelligence

2110.03422

Country:

Asia > Middle East > Iraq > Babil Governorate (0.05)
Asia > India > Karnataka > Bengaluru (0.05)
Europe > United Kingdom > Wales (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback